The main objective of this paper is to map and visualize COVID-19 cases and deaths by county, for the United States and for New York State, based on data collected as of June 8, 2020.
First, we load the county spatial data with the readOGR function and the COVID-19 cases CSV file dated June 8 (0608). For this analysis we consider only the most recent counts, so we filter the data to the date 2020-06-08. To merge the COVID-19 data with the counties, we prepared a common five-digit key: fips in the COVID-19 data is converted to character and left-padded with a zero where it has only four digits, and CNTYIDFP in the county data is built by concatenating STATEFP and COUNTYFP.
library(rgdal)  # provides readOGR
setwd("~/Downloads/HU/2020Summer/DataViz512/5_0606V/gisData")
# Read the county boundary shapefile from the working directory
counties = readOGR(dsn=".",layer="cb_2016_us_county_500k")
## OGR data source with driver: ESRI Shapefile
## Source: "/Users/akhilasaineni/Downloads/HU/2020Summer/DataViz512/5_0606V/gisData", layer: "cb_2016_us_county_500k"
## with 3233 features
## It has 9 fields
## Integer64 fields read as strings: ALAND AWATER
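covid = read.csv("us-counties_0608.csv")
# Keep only the latest snapshot
covid = covid[covid$date == '2020-06-08',]
# FIPS codes are read as numbers, which drops the leading zero; restore it
covid$fips = as.character(covid$fips)
covid$fips = ifelse(nchar(covid$fips) == 4, paste0("0",covid$fips), covid$fips)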
summary(covid)
## date county state fips
## 2020-06-08:3011 Washington: 30 Texas : 235 Length:3011
## 2020-01-21: 0 Jefferson : 26 Georgia : 160 Class :character
## 2020-01-22: 0 Unknown : 26 Virginia: 130 Mode :character
## 2020-01-23: 0 Franklin : 25 Kentucky: 119
## 2020-01-24: 0 Jackson : 23 Missouri: 110
## 2020-01-25: 0 Lincoln : 23 Illinois: 101
## (Other) : 0 (Other) :2858 (Other) :2156
## cases deaths
## Min. : 0.0 Min. : 0.00
## 1st Qu.: 13.0 1st Qu.: 0.00
## Median : 53.0 Median : 1.00
## Mean : 654.5 Mean : 36.85
## 3rd Qu.: 234.5 3rd Qu.: 8.00
## Max. :212122.0 Max. :21356.00
##
head(covid)
## date county state fips cases deaths
## 215617 2020-06-08 Autauga Alabama 01001 273 5
## 215618 2020-06-08 Baldwin Alabama 01003 335 9
## 215619 2020-06-08 Barbour Alabama 01005 198 1
## 215620 2020-06-08 Bibb Alabama 01007 82 1
## 215621 2020-06-08 Blount Alabama 01009 75 1
## 215622 2020-06-08 Bullock Alabama 01011 240 9
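# Build the five-digit county FIPS key from the state and county codes
counties$CNTYIDFP <- paste0(counties$STATEFP, counties$COUNTYFP)
# Join the COVID-19 counts onto the county polygons (the result is named merge, which shadows the function name)
merge <- merge(counties, covid, by.x = "CNTYIDFP", by.y = "fips")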
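Before relying on the merged object, it can help to confirm that the keys actually line up. The following is a minimal sketch on toy data frames, not the code used for this report: `counties_toy` and `covid_toy` are hypothetical stand-ins for the county attribute table and the COVID-19 table, the Autauga and Baldwin values are taken from the head(covid) output above, and the "Unknown" row with no FIPS code is an assumed example of a record that cannot be matched. It uses base R's merge on plain data frames rather than the spatial merge.

```r
# Toy county attributes (hypothetical stand-in for the shapefile's table)
counties_toy <- data.frame(
  STATEFP  = c("01", "01", "01"),
  COUNTYFP = c("001", "003", "005"),
  NAME     = c("Autauga", "Baldwin", "Barbour"),
  stringsAsFactors = FALSE
)
counties_toy$CNTYIDFP <- paste0(counties_toy$STATEFP, counties_toy$COUNTYFP)

# Toy COVID-19 rows; the last row is an assumed record without a FIPS code
covid_toy <- data.frame(
  county = c("Autauga", "Baldwin", "Unknown"),
  fips   = c(1001L, 1003L, NA),
  cases  = c(273L, 335L, 10L),
  stringsAsFactors = FALSE
)

# Left-pad to five digits, keeping missing codes as NA
# (sprintf alone would turn NA into the string "   NA")
covid_toy$fips <- ifelse(is.na(covid_toy$fips), NA_character_,
                         sprintf("%05d", covid_toy$fips))

# Left join: keep every county, even those without a COVID-19 row
joined <- merge(counties_toy, covid_toy,
                by.x = "CNTYIDFP", by.y = "fips", all.x = TRUE)

# Counties that received no COVID-19 data
unmatched_counties <- joined$NAME[is.na(joined$cases)]

# COVID-19 rows that found no county polygon
unmatched_covid <- covid_toy$county[!(covid_toy$fips %in% counties_toy$CNTYIDFP)]

print(unmatched_counties)
print(unmatched_covid)
```

Running the same two checks on the real `counties` and `covid` objects would show how many counties are drawn without data and how many reported cases never reach the map.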
To conduct this analysis, we used functions from the expss package to create baseline tabulations of COVID-19 positive cases and deaths reported for each state. New York State has the highest totals, with 383,591 cases and 30,239 deaths.
library(expss)  # provides apply_labels and the tab_* functions
data = apply_labels(covid,
cases="Total COVID 19 Cases",
deaths="Total COVID 19 Deaths",
county="County Name",
state="State Name"
)
data %>%
tab_cells(cases, deaths) %>%
tab_cols(total(label = "#Total| |"), state) %>%
tab_stat_fun(TotalCases=w_sum, method=list) %>%
tab_pivot() %>%
tab_transpose()
| | Total COVID 19 Cases | Total COVID 19 Deaths |
|---|---|---|
| #Total | 1970613 | 110966 |
| State Name | | |
| Alabama | 20925 | 718 |
| Alaska | 607 | 8 |
| Arizona | 27761 | 1052 |
| Arkansas | 9740 | 155 |
| California | 134287 | 4679 |
| Colorado | 28169 | 1543 |
| Connecticut | 44092 | 4084 |
| Delaware | 9972 | 398 |
| District of Columbia | 9389 | 491 |
| Florida | 64896 | 2711 |
| Georgia | 49995 | 2176 |
| Guam | 1149 | 6 |
| Hawaii | 664 | 17 |
| Idaho | 3197 | 83 |
| Illinois | 128819 | 5964 |
| Indiana | 38553 | 2316 |
| Iowa | 22111 | 623 |
| Kansas | 10724 | 237 |
| Kentucky | 11701 | 487 |
| Louisiana | 43163 | 2944 |
| Maine | 2588 | 99 |
| Maryland | 59024 | 2776 |
| Massachusetts | 103626 | 7353 |
| Michigan | 64911 | 5916 |
| Minnesota | 28235 | 1208 |
| Mississippi | 17768 | 837 |
| Missouri | 15159 | 832 |
| Montana | 548 | 18 |
| Nebraska | 15752 | 196 |
| Nevada | 9815 | 442 |
| New Hampshire | 5079 | 286 |
| New Jersey | 164497 | 12214 |
| New Mexico | 9062 | 400 |
| New York | 383591 | 30239 |
| North Carolina | 36581 | 1041 |
| North Dakota | 2883 | 75 |
| Northern Mariana Islands | 28 | 2 |
| Ohio | 38837 | 2404 |
| Oklahoma | 7206 | 348 |
| Oregon | 4925 | 164 |
| Pennsylvania | 80432 | 6007 |
| Puerto Rico | 5046 | 142 |
| Rhode Island | 15642 | 799 |
| South Carolina | 14800 | 557 |
| South Dakota | 5471 | 65 |
| Tennessee | 27217 | 417 |
| Texas | 77326 | 1856 |
| Utah | 12378 | 124 |
| Vermont | 1075 | 55 |
| Virgin Islands | 71 | 6 |
| Virginia | 51251 | 1477 |
| Washington | 25593 | 1168 |
| West Virginia | 2161 | 84 |
| Wisconsin | 21161 | 650 |
| Wyoming | 960 | 17 |
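As a cross-check on a tabulation like the one above, the same per-state sums can be computed with base R's aggregate. This is a minimal sketch under stated assumptions: `toy` is a hypothetical subset, the six Alabama rows are copied from the head(covid) output, and the two "State B" rows are invented purely so that there is more than one group to sum.

```r
# Toy subset: six Alabama rows from head(covid), plus two invented rows
toy <- data.frame(
  county = c("Autauga", "Baldwin", "Barbour", "Bibb", "Blount", "Bullock",
             "County X", "County Y"),
  state  = c(rep("Alabama", 6), rep("State B", 2)),
  cases  = c(273, 335, 198, 82, 75, 240, 1000, 500),
  deaths = c(5, 9, 1, 1, 1, 9, 40, 20),
  stringsAsFactors = FALSE
)

# Sum cases and deaths within each state
state_totals <- aggregate(cbind(cases, deaths) ~ state, data = toy, FUN = sum)

# Sort so the state with the most cases comes first
state_totals <- state_totals[order(-state_totals$cases), ]

print(state_totals)
```

Applied to the full `covid` data frame, the first row of the sorted result should agree with the state reported as having the highest totals.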
We then created a leaflet map of the total number of COVID-19 positive cases by county in the United States, shown below. Each county's label gives the county and state name, followed by the number of cases and deaths, prefixed with C and D respectively.
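library(leaflet)
# Nine quantile-based color bins over the county case counts
pal = colorQuantile("Reds", covid$cases, n = 9)
leaflet(merge) %>% setView(-98,39, zoom=4) %>%
addPolygons(weight=.10, color="blue",fillOpacity = .2, fillColor = ~pal(cases),
# paste0 avoids the stray spaces that paste() inserts before "," and ":"
label= paste0(merge$NAME, ", ", merge$state, ": C ", merge$cases, " D ", merge$deaths))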
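The choice of a quantile palette matters because county case counts are heavily skewed (in the summary above the median is 53 while the maximum is 212,122). The sketch below illustrates the idea with base R only; it approximates what a quantile palette does and is not leaflet's internal code. The vector `x` holds illustrative values chosen to mimic that skew, and `n_bins` is set to 3 rather than 9 to keep the output small.

```r
# Illustrative, skewed case counts (not real county data)
x <- c(0, 5, 13, 53, 120, 235, 900, 5000, 212122)
n_bins <- 3

# Quantile breaks: each bin receives roughly the same number of counties
q_breaks <- quantile(x, probs = seq(0, 1, length.out = n_bins + 1))
quantile_bin <- cut(x, breaks = q_breaks, include.lowest = TRUE, labels = FALSE)

# Equal-width breaks: bins cover equal ranges of the case count
l_breaks <- seq(min(x), max(x), length.out = n_bins + 1)
linear_bin <- cut(x, breaks = l_breaks, include.lowest = TRUE, labels = FALSE)

print(data.frame(cases = x, quantile_bin, linear_bin))
```

With quantile breaks the nine values spread evenly over the three bins, while with equal-width breaks every value except the largest falls into the first bin, so a map colored that way would look almost uniformly pale.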